Effectiveness of Aggregation Methods in Blog Distillation

Authors

  • Mostafa Keikha
  • Fabio Crestani
Abstract

This paper addresses the blog distillation problem: given a user query, find the blogs that are most related to the query topic. We model each post as evidence of the relevance of a blog to the query, and use aggregation methods such as Ordered Weighted Averaging (OWA) operators to combine the evidence. We show that using only highly relevant evidence (posts) for each blog can result in an effective retrieval system. We implement our methods on the TREC’06 blog collection with two standard query sets, from TREC’07 and TREC’08. Our experiments on the TREC’07 query set show a 35% improvement in Mean Average Precision and a 22% improvement in Precision@10 over the best fusion method applied to blog distillation. Similar results were obtained on the TREC’08 query set, with a 31% improvement in Mean Average Precision and a 20% improvement in Precision@10 over the baseline.
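As a rough illustration of the aggregation idea in the abstract, the sketch below scores a blog by applying an OWA operator to the relevance scores of its posts. The abstract does not give the weight vectors the authors used, so the `top_k_weights` helper, the `rank_blogs` function, the blog names, and all score values are assumptions made only for this example. The one property taken from the definition of OWA is that weights are attached to rank positions after sorting, not to particular posts.

```python
from typing import Dict, List, Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: sort scores in descending order,
    then take the weighted sum, so each weight belongs to a rank."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def top_k_weights(n: int, k: int) -> List[float]:
    """Hypothetical weighting (an assumption, not from the paper):
    uniform weight on the k highest-ranked posts, zero on the rest."""
    k = max(1, min(k, n))
    return [1.0 / k] * k + [0.0] * (n - k)


def rank_blogs(post_scores: Dict[str, List[float]], k: int) -> List[str]:
    """Rank blogs by the OWA aggregate of their post relevance scores."""
    blog_scores = {
        blog: owa(scores, top_k_weights(len(scores), k))
        for blog, scores in post_scores.items()
        if scores  # skip blogs with no scored posts
    }
    return sorted(blog_scores, key=blog_scores.get, reverse=True)


# Made-up post scores for two hypothetical blogs.
post_scores = {
    "blog_a": [0.9, 0.8, 0.1, 0.1],  # two strongly relevant posts
    "blog_b": [0.5, 0.5, 0.5, 0.5],  # uniformly mediocre posts
}
print(rank_blogs(post_scores, k=2))  # only the two best posts per blog count
print(rank_blogs(post_scores, k=4))  # equivalent to a plain average
```

With `k=2` the blog with two strongly relevant posts ranks first, while the plain average (`k=4`) prefers the uniformly mediocre blog. This is one simple way to see why restricting aggregation to highly relevant posts can change the ranking.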


Similar resources

FEUP at TREC 2008 Blog Track: Using Temporal Evidence for Ranking and Feed Distillation

This paper presents the participation of FEUP, from University of Porto, in the TREC 2008 Blog Track. FEUP participated in two tasks, the baseline adhoc retrieval task and the blog finding distillation task. Our approach was focused on the use of the temporal information available in the TREC Blog06 collection. For the baseline adhoc retrieval task a simple temporal sort was evaluated. In the b...


Faceted Blog Distillation System: Find an in-Depth Blog

With the increasing number of blog users, traditional blog search can no longer meet their demands. More work should be done to accommodate the need to find good blogs to read, besides the topic-relevant blogs. This paper focuses on the problem of in-depth faceted blog distillation, addressing the quality aspect of the retrieved blogs. We propose a novel L-Qtf coefficient and LQE model to ...


Linguistic aggregation methods in blog retrieval

This paper addresses the blog distillation problem, that is, given a user query find the blogs that are most related to the query topic. We model each post as evidence of the relevance of a blog to the query, and use aggregation methods like Ordered Weighted Averaging (OWA) operators to combine the evidence. We show that using only highly relevant evidence (posts) for each blog can result in an...


Cross-Lingual Blog Analysis based on Multilingual Blog Distillation from Multilingual Wikipedia Entries

The goal of this paper is to cross-lingually analyze multilingual blogs collected with a topic keyword. The framework for collecting multilingual blogs with a topic keyword is designed as the blog distillation (feed search) procedure. Multilingual queries for retrieving blog feeds are created from Wikipedia entries. Finally, we cross-lingually and cross-culturally compare less well known facts and...


HIT_LTRC at TREC 2010 Blog Track: Faceted Blog Distillation

This paper describes our participation in the faceted blog distillation task at Blog Track 2010. In our approach, the Indri toolkit is applied for basic topic relevance retrieval. Then the Maximum Entropy (ME) model is adopted to judge the relevance of each blog to the specified facet. Feed faceted relevance is calculated by integrating the average relevance of all blogs within a feed and the average r...




Publication date: 2009